Requirements#

Let’s understand the functional and non-functional requirements below:

Functional requirements#

A user should be able to do the following:

  • Questions and answers: Users can ask questions and give answers. Questions and answers can include images and videos.
  • Upvote/downvote and comment: Users can upvote, downvote, and comment on answers.
  • Search: Users should have a search feature to find questions already asked on the platform by other users.
  • Recommendation system: A user can view their feed, which includes topics they’re interested in. The feed can also include questions that need answers or answers that interest the reader. The system should facilitate user discovery with a recommender system.
  • Ranking answers: We enhance user experience by ranking answers according to their usefulness. The most helpful answer will be ranked highest and listed at the top.
Functional and non-functional requirements of Quora

Non-functional requirements#

  • Scalability: The system should scale well as the number of features and users grows with time. This means that performance and usability should not degrade as the number of users increases.
  • Consistency: The design should ensure that different users’ views of the same content are consistent. In particular, critical content like questions and answers should be the same for any collection of viewers. However, it is not necessary that all users of Quora see a newly posted question, answer, or comment right away.
  • Availability: The system should have high availability. This applies to cases where servers receive a large number of concurrent requests.
  • Performance: The system should provide a smooth experience to the user without a noticeable delay.

Resource estimation#

In this section, we’ll estimate the resource requirements for the Quora service. We’ll make assumptions to get a practical and tractable estimate. We’ll estimate the number of servers, the storage, and the bandwidth required to serve a large number of users.

Assumptions: It is important to base our estimation on some underlying assumptions. We, therefore, assume the following:

  • There are a total of 1 billion users, out of which 300 million are daily active users.
  • 15% of questions have an image, and 5% of questions have a video embedded in them. A question cannot have both at the same time.
  • An image is 250 KB, and a video is 5 MB.

Number of servers estimation#

Let’s estimate the requests per second (RPS) for our design. If there are 300 million daily active users on average and each user generates 20 requests per day, then the total number of requests in a day will be:

$$300 \times 10^6 \times 20 = 6 \times 10^9$$

Therefore, the RPS is $\frac{6 \times 10^9}{86400} \approx 69500$ requests per second.

Estimating RPS

| Quantity | Value |
|---|---|
| Daily active users | 300 million |
| Requests per day per user | 20 |
| Requests per second (RPS) | 69,444 |
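The arithmetic above can be reproduced with a short Python sketch. The function and constant names are illustrative choices, and the calculation assumes requests are spread evenly across the day, which is the same simplification the estimate makes.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def requests_per_second(daily_active_users: int, requests_per_user_per_day: int) -> float:
    """Average RPS, assuming requests are spread evenly over one day."""
    total_requests_per_day = daily_active_users * requests_per_user_per_day
    return total_requests_per_day / SECONDS_PER_DAY


# Figures taken from the assumptions in the text: 300 million DAU, 20 requests each.
rps = requests_per_second(300_000_000, 20)
print(f"{rps:,.0f} requests per second")  # prints "69,444 requests per second"
```

The exact quotient is about 69,444; the text rounds it up to 69,500.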

We already established in the back-of-the-envelope calculations chapter that we’ll use the following formula to estimate a pragmatic number of servers:

$$\frac{\text{Number of daily active users}}{\text{RPS of a server}} = \frac{300 \times 10^6}{8000} = 37500$$

The estimated number of servers required for Quora

Therefore, the total number of servers required to facilitate 300 million users generating an average of 69,500 requests per second will be 37,500.
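As a quick check, the server formula can be written as a small Python function. The name `servers_needed` is hypothetical, and the 8,000 RPS per-server capacity is the figure used in the formula above; rounding up with `math.ceil` is an added assumption so that a fractional result never under-provisions.

```python
import math


def servers_needed(daily_active_users: int, rps_per_server: int) -> int:
    """Number of servers = daily active users / RPS one server can handle, rounded up."""
    return math.ceil(daily_active_users / rps_per_server)


# 300 million daily active users, 8,000 RPS per server (figure from the text).
print(servers_needed(300_000_000, 8_000))  # prints 37500
```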

Storage estimation#

Let’s keep in mind our assumption that 15% of questions have images and 5% have videos. So, we’ll make the following assumptions to estimate the storage requirements for our design:

  • Each of the 300 million active users posts 1 question in a day, and each question has 2 responses on average, 10 upvotes, and 5 comments in total.
  • The collective storage required for the textual content of one question equals 1 KB.

Storage Requirements Estimation Calculator

| Quantity | Value |
|---|---|
| Questions per user | 1 per day |
| Total questions per day | 300 million |
| Size of textual content per question | 1 KB |
| Image size | 250 KB |
| Video size | 5 MB |
| Questions containing images | 15 percent |
| Questions containing videos | 5 percent |
| Storage for textual content | 0.3 TB |
| Storage for image content | 11.25 TB |
| Storage for video content | 75 TB |

Summarizing storage requirements of Quora

Total storage required per day $= 0.3\,\text{TB} + 11.25\,\text{TB} + 75\,\text{TB} = 86.55\,\text{TB}$

The daily storage requirement of Quora seems very high. But for a service with 300 million DAU, a yearly requirement of $86.55\,\text{TB} \times 365 \approx 31.6\,\text{PB}$ is feasible. The practical requirement will be much higher because we have disregarded the storage required for a number of things. For example, the data of non-active users (out of the 1 billion total) will also require storage.
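The storage figures can be reproduced with the following Python sketch. It assumes decimal units (1 KB = 1,000 bytes), which is what the calculator's numbers imply, and the function name and parameter names are illustrative rather than part of the original design. Percentages are kept as integers to avoid floating-point rounding in the question counts.

```python
KB = 1_000
MB = 1_000 * KB
TB = 1_000_000 * MB
PB = 1_000 * TB


def daily_storage_bytes(questions_per_day: int,
                        text_size: int = 1 * KB,
                        image_size: int = 250 * KB,
                        video_size: int = 5 * MB,
                        image_percent: int = 15,
                        video_percent: int = 5) -> dict:
    """Break down one day's storage into text, image, and video bytes."""
    text = questions_per_day * text_size
    images = questions_per_day * image_percent // 100 * image_size
    videos = questions_per_day * video_percent // 100 * video_size
    return {"text": text, "images": images, "videos": videos,
            "total": text + images + videos}


# 300 million users posting one question each per day (assumption from the text).
usage = daily_storage_bytes(300_000_000)
for name, size in usage.items():
    print(f"{name}: {size / TB:.2f} TB per day")
print(f"yearly: {usage['total'] * 365 / PB:.1f} PB")  # prints "yearly: 31.6 PB"
```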

Bandwidth estimation#

The bandwidth estimate requires the calculation of incoming and outgoing data through the network.

  • Incoming traffic: The incoming traffic bandwidth required will be equal to $\frac{86.55\,\text{TB}}{86400\,\text{s}} \times 8 \approx 8\,\text{Gbps}$.
  • Outgoing traffic: We have assumed that each of the 300 million active users views 20 questions per day, so the total bandwidth requirements can be found in the calculator below:

Bandwidth Requirements Estimation Calculator

| Quantity | Value |
|---|---|
| Total storage required per day | 86.55 TB |
| Incoming traffic bandwidth | 8 Gbps |
| Questions viewed per user | 20 per day |
| Total questions viewed | 69,444 per second |
| Bandwidth for text of all questions | 0.56 Gbps |
| Bandwidth for 15% of image content | 20.83 Gbps |
| Bandwidth for 5% of video content | 138.89 Gbps |
| Outgoing traffic bandwidth | 160.3 Gbps |

Summarizing the bandwidth requirements of Quora

The total bandwidth requirement of Quora is equal to:

Incoming + outgoing traffic bandwidth $= 8\,\text{Gbps} + 160.3\,\text{Gbps} = 168.3\,\text{Gbps}$
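The bandwidth estimate follows the same pattern and can be sketched in Python. As before, this assumes decimal units and an even spread of traffic over the day; `estimate_bandwidth` and its parameters are hypothetical names chosen for illustration.

```python
SECONDS_PER_DAY = 86_400
KB = 1_000
MB = 1_000 * KB
TB = 1_000_000 * MB


def to_gbps(bytes_per_second: float) -> float:
    """Convert bytes per second to gigabits per second."""
    return bytes_per_second * 8 / 1e9


def estimate_bandwidth(daily_upload_bytes: float,
                       daily_active_users: int,
                       views_per_user_per_day: int,
                       text_size: int = 1 * KB,
                       image_size: int = 250 * KB,
                       video_size: int = 5 * MB,
                       image_fraction: float = 0.15,
                       video_fraction: float = 0.05) -> dict:
    """Incoming and outgoing bandwidth in Gbps, assuming evenly spread traffic."""
    incoming = to_gbps(daily_upload_bytes / SECONDS_PER_DAY)
    views_per_second = daily_active_users * views_per_user_per_day / SECONDS_PER_DAY
    text = to_gbps(views_per_second * text_size)
    images = to_gbps(views_per_second * image_fraction * image_size)
    videos = to_gbps(views_per_second * video_fraction * video_size)
    outgoing = text + images + videos
    return {"incoming": incoming, "text": text, "images": images,
            "videos": videos, "outgoing": outgoing, "total": incoming + outgoing}


# Inputs from the text: 86.55 TB uploaded per day, 300 million DAU, 20 views each.
estimate = estimate_bandwidth(86.55 * TB, 300_000_000, 20)
for name, gbps in estimate.items():
    print(f"{name}: {gbps:.2f} Gbps")
```

The results agree with the calculator up to rounding; for instance, the incoming bandwidth comes out slightly above 8 Gbps before it is rounded.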

Building blocks we will use#

We’ll use the following building blocks for the initial design of Quora:

Building blocks required for our design
  • Load balancers will be used to divide the traffic load among the service hosts.
  • Databases are essential for storing all sorts of data, such as user questions and answers, comments, and likes and dislikes. Also, user data will be stored in the databases. We may use different types of databases to store different data.
  • A distributed caching system will be used to store frequently accessed data. We can also use caching to store our view counters for different questions.
  • The blob store will keep images and video files.
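To illustrate the idea of keeping view counters in a cache, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class name `ViewCounterCache`, the batch-flush policy, and the plain dictionary standing in for the database. A real deployment would use a distributed cache and a persistent store rather than in-process objects.

```python
from collections import Counter


class ViewCounterCache:
    """Buffers view counts in memory and flushes them to a backing store in batches.

    `store` is a plain dict standing in for the database (an assumption).
    """

    def __init__(self, store: dict, flush_threshold: int = 100):
        self._store = store
        self._pending = Counter()       # views not yet written to the store
        self._pending_total = 0
        self._flush_threshold = flush_threshold

    def record_view(self, question_id: str) -> None:
        """Count one view; flush once enough views have accumulated."""
        self._pending[question_id] += 1
        self._pending_total += 1
        if self._pending_total >= self._flush_threshold:
            self.flush()

    def views(self, question_id: str) -> int:
        """Persisted views plus views still buffered in the cache."""
        return self._store.get(question_id, 0) + self._pending[question_id]

    def flush(self) -> None:
        """Write all buffered counts to the store and clear the buffer."""
        for question_id, count in self._pending.items():
            self._store[question_id] = self._store.get(question_id, 0) + count
        self._pending.clear()
        self._pending_total = 0


database = {}
counters = ViewCounterCache(database, flush_threshold=3)
counters.record_view("q1")
counters.record_view("q1")
print(counters.views("q1"), database)  # prints "2 {}"
counters.record_view("q2")             # third view triggers a flush
print(database)                        # prints "{'q1': 2, 'q2': 1}"
```

Batching writes this way trades a small window of potential data loss for far fewer database writes, which is one reason a counter is a natural fit for a cache.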
